Active and Semi-supervised Data Domain Description
Authors
Abstract
Data domain description techniques aim at deriving concise descriptions of objects belonging to a category of interest. For instance, the support vector domain description (SVDD) learns a hypersphere enclosing the bulk of the provided unlabeled data, such that points lying outside of the ball are considered anomalous. However, relevant information such as expert and background knowledge remains unused in the unsupervised setting. In this paper, we rephrase data domain description as a semi-supervised learning task; that is, we propose a semi-supervised generalization of data domain description (SSSVDD) that processes both unlabeled and labeled examples. The corresponding optimization problem is nonconvex. We translate it into an unconstrained, continuous problem that can be optimized accurately by gradient-based techniques. Furthermore, we devise an effective active learning strategy to query low-confidence observations. Our empirical evaluation on network intrusion detection and object recognition tasks shows that our SSSVDDs consistently outperform baseline methods in relevant learning settings.
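To make the ideas in the abstract concrete, the following is a minimal sketch, not the paper's actual SSSVDD formulation. It assumes a linear (non-kernelized) hypersphere, a plain hinge penalty, and labels y = +1 for normal and y = -1 for anomalous points. The names `fit_hypersphere`, `boundary_score`, `query_low_confidence`, and the parameters `nu`, `kappa`, `lr`, and `n_iter` are hypothetical choices for illustration. The sketch minimizes the unconstrained objective R^2 + C * sum(max(0, ||x - c||^2 - R^2)) plus a hinge term on labeled points by subgradient descent, then queries the points closest to the boundary as a stand-in for "low-confidence observations".

```python
import numpy as np


def fit_hypersphere(X_unl, X_lab=None, y_lab=None, nu=0.1, kappa=1.0,
                    lr=0.01, n_iter=500):
    """Subgradient descent on an unconstrained hypersphere objective.

    Assumed objective (illustrative, not taken from the paper):
        s + C * sum_i max(0, ||x_i - c||^2 - s)
          + kappa * sum_j max(0, y_j * (||x_j - c||^2 - s))
    where s stands for the squared radius R^2 and C = 1 / (nu * n).
    """
    X_unl = np.asarray(X_unl, dtype=float)
    n = len(X_unl)
    C = 1.0 / (nu * n)
    c = X_unl.mean(axis=0)                          # start at the data mean
    s = float(((X_unl - c) ** 2).sum(axis=1).mean())
    has_labels = X_lab is not None and len(X_lab) > 0
    if has_labels:
        X_lab = np.asarray(X_lab, dtype=float)
        y_lab = np.asarray(y_lab, dtype=float)
    for _ in range(n_iter):
        diff = X_unl - c
        outside = (diff ** 2).sum(axis=1) > s       # unlabeled points outside the ball
        grad_s = 1.0 - C * outside.sum()
        grad_c = -2.0 * C * diff[outside].sum(axis=0)
        if has_labels:
            diff_l = X_lab - c
            # Active when a normal point is outside or an anomaly is inside.
            active = y_lab * ((diff_l ** 2).sum(axis=1) - s) > 0
            w = kappa * y_lab[active]
            grad_s -= w.sum()
            grad_c -= 2.0 * (w[:, None] * diff_l[active]).sum(axis=0)
        c = c - lr * grad_c
        s = max(s - lr * grad_s, 0.0)               # keep the squared radius non-negative
    return c, s


def boundary_score(X, c, s):
    """Positive outside the ball (anomalous), negative inside."""
    return ((np.asarray(X, dtype=float) - c) ** 2).sum(axis=1) - s


def query_low_confidence(X, c, s, k=5):
    """Indices of the k points closest to the boundary (smallest |score|)."""
    return np.argsort(np.abs(boundary_score(X, c, s)))[:k]
```

In this sketch, an active learning loop would call `query_low_confidence` on the unlabeled pool, obtain labels for the returned indices from an expert, move those points into `X_lab`/`y_lab`, and refit. Whether the paper's query strategy uses distance to the boundary alone is not stated in the abstract, so that choice is an assumption here.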
Similar resources
Active Annotation
This paper introduces a semi-supervised learning framework for creating training material, namely active annotation. The main intuition is that an unsupervised method is first used to annotate the data imperfectly, and the errors it makes are then detected automatically and corrected by a human annotator. We applied active annotation to named entity recognition in the biomedical domain and encou...
Combining Committee-Based Semi-supervised and Active Learning and Its Application to Handwritten Digits Recognition
Semi-supervised learning reduces the cost of labeling the training data of a supervised learning algorithm by using unlabeled data together with labeled data to improve performance. Co-Training is a popular semi-supervised learning algorithm that requires multiple redundant and independent sets of features (views). In many real-world application domains, this requirement cannot be sa...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal degrades significantly in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...
Semi-supervised subclass support vector data description for image and video classification
In this paper, a One-Class Classification method, namely the Semi-Supervised Subclass Support Vector Data Description, is presented. The proposed method extends Support Vector Data Description by two means, i.e. by exploiting global class information expressed by the class data variance and local neighborhood information between all available (labeled and unlabeled), following the smoothness a...
Semi-supervised Ultrasound Image Segmentation Based on Direction Energy and Texture Intensity
To address the problem of accurate ultrasound image segmentation, this paper proposes a novel semi-supervised SVM segmentation method based on major features in the curvelet domain. First, ultrasound images were decomposed into different directions and frequencies in the curvelet domain; then the Cauchy model was used to simulate the distribution of the curvelet coefficients, thus the main distribution of the curvel...